Abstract

Grid computing is distributed computing performed transparently across multiple
administrative domains. Grid middleware, which is meant to enable access to grid
resources, is currently widely seen as being too heavyweight and, in consequence,
unwieldy for general scientific use. Its heavyweight nature, especially on the
client side, has severely restricted the uptake of grid technology by computational
scientists. In this paper, we describe the Application Hosting Environment (AHE),
which we have developed to address some of these problems. The AHE is a lightweight,
easily deployable environment designed to allow the scientist to quickly and easily
run legacy applications on distributed grid resources. It provides a higher level
abstraction of a grid than is offered by existing grid middleware schemes such as
the Globus Toolkit. As a result, the computational scientist does not need to know
the details of any particular underlying grid middleware and is isolated from any
changes to it on the distributed resources. The functionality provided by the AHE
is 'application-centric': applications are exposed as web services with a well-defined
standards-compliant interface. This allows the computational scientist to start and
manage application instances on a grid in a transparent manner, thus greatly
simplifying the user experience. We describe how a range of computational science codes
have been hosted within the AHE and how the design of the AHE allows us to
implement complex workflows for deployment on grid infrastructure.
1 Introduction



We define grid computing as distributed computing performed transparently
across multiple administrative domains [1,2]. By grid computing, we refer
to any activity involving digital information - computational processing,
visualisation, data collection from instruments and computational analyses,
database access, storage and retrieval [3] - performed by utilising
computational, visualisation, network and storage resources available on a grid.
Grid computing has immense potential for computational scientists. A grid 
composed of resources belonging to different institutions has the potential 
to provide a level of service and support that cannot be expected from 
intra-institutional resources or even an existing set of independent resources 
which are not integrated into a grid as such. Some of the benefits possible
from using grid infrastructure include capacity for increased workload
volumes, faster response and turn-around times for the user, and the ability
to run analyses more frequently. Integration of resources into a grid
infrastructure allows for better consolidation and workload management and 
may result in reduction of overall costs for grid resource owners. Grids can 
be used to solve computational science problems that could not be solved 
by a single resource either at all or efficiently, possibly due to compute,
memory, data or network limitations. A grid provides a flexible and
dynamic infrastructure based on open standards which scales efficiently as
the number of component resources increases. A general computational grid 
should provide easy access to computational, visualisation, storage, network 
and other resources, enabling computational scientists to pick and choose 
those required to achieve their widely varying scientific objectives. 

Many emerging grids are in operation around the world, for example,
the UK National Grid Service (NGS) [5], US TeraGrid [6], Enabling Grids
for E-sciencE (EGEE) [7] and Distributed European Infrastructure For
Supercomputing Applications (DEISA) [8] in the EU and the Japanese
National Research Grid Initiative (NAREGI) [9], which make capability
for computational processing, data storage and visualisation on high speed
networks available to computational scientists. The engagement of scientists
in grid activities is essential for such grids to mature into genuine production- 
level infrastructure which can deliver the much heralded potential of grid 
computing to the international computational science community. 



2 Motivation for the Application Hosting Environment 

Grid architecture consists of various components including grid middleware, 
application programming interfaces (APIs), software development kits,
protocols, services, system software and hardware [10]. In our efforts over the
past four years to utilise grids to address new challenges in computational
science and engineering, we have experienced significant barriers, particularly
when dealing with heavyweight grid middleware [11]. Transparency, implying
minimal complexity for computational scientists using grid technology, has 
been missing from existing distributed computing infrastructures that might 
aspire to call themselves grids. In this paper, we present the approach we 
have taken to address and overcome the heavyweight grid middleware problem. 

Grid middleware is the software layer that transforms distributed, het- 
erogeneous resources spanning multiple administrative domains into a 
single, integrated grid so that the heterogeneity and multiplicity of the 
distributed resources is transparent to the user. The user interacts with all 
the heterogeneous resources in the grid through a uniform interface. Thus, 
middleware is a critical component of grid infrastructure and its lack of 
usability has been a major barrier to the uptake of grid technology. Many 
of the current grid middleware solutions can be characterised as what we
describe as 'heavyweight', that is, they display some or all of the following
features: (i) the grid middleware's client-side software, which is what the
users/computational scientists have to interact with on their desktops, is
difficult to install and configure, (ii) it has non-negligible dependencies on
supporting software, (iii) it requires non-standard ports to be opened within
client-side firewalls, and (iv) the server-side components of the grid middleware
exhibit the same heavyweight characteristics as the client-side software, and in
most cases the situation is worse on the server side. As a result, the user
faces significant obstacles in deploying and using many of the current grid 
middleware solutions. This has led to reluctance amongst many scientists to 
actively embrace grid technology [11]. It is important for the development of 
grid computing as a whole that as many diverse groups of scientists as possible 
begin to use grids; it is crucial that the uptake of grid technologies occurs
beyond specialized grid-centric projects [12,13,14]. While some progress has
been made in the field of grid middleware technology [15,16], the prospect of a
heterogeneous, on-demand computational grid as ubiquitous as the electrical 
power grid is still a long way off. 

To address these deficiencies, there is now much attention focused on
'lightweight' middleware solutions [17,18,19,20], which attempt to lower
the barrier of entry for grid users. Our own efforts
have been concentrated on the development of the Application Hosting
Environment (AHE), a lightweight, WSRF-compliant [21], web services based
environment for hosting scientific applications. The AHE alleviates many
of the problems posed by heavyweight grid middleware for computational 
scientists, who are among those who stand to benefit the most from grid 
computing. The AHE allows scientists to quickly and easily run unmodified 
application codes on grid resources, managing the transfer of files to and from 
the grid resources and allowing the user to monitor the status of application
instances that are run on the grid. In the AHE, we adopt a standards-based 
web services approach to expose scientific applications on the grid as stateful 
web services. 

From a computational scientist's perspective, a web service is simply 
any computational application functionality that is accessible over a network 
as a service, that can be invoked in multiple contexts, possibly to form 
higher-level services. The description of the services provided by the AHE is
standards-compliant [22] so that they can be invoked using any web services
compliant client (see Appendix B for more details). The AHE design is
influenced by our previous work on WEDS (WSRF-based Environment for
Distributed Simulation) [18], a hosting environment designed for operation
primarily within a single administrative domain. The AHE differs from WEDS 
in that it is designed to operate seamlessly across multiple administrative 
domains, in a true grid sense. 



3 Design of the Application Hosting Environment 

The AHE is an ensemble of programs written in Perl with Java-based 
command-line and Graphical User Interface (GUI) client tools. The purpose 
of the AHE is to provide a mechanism for deploying applications onto 
computational grids that makes it easy for the scientist to start and manage 
applications on the grid. The AHE has a client-server architecture in which 
as much of the complexity as possible is moved to the AHE server. This 
makes the AHE client thinner, with a reduced set of software dependencies, 
and easy to install. Our approach is more flexible than the conventional
portal approach because the AHE client is not constrained to
being a web browser. The AHE client has been implemented as a desktop GUI
application as well as an interoperating set of command-line tools, making it
a flexible tool for constructing lightweight workflows
via scripting, as described in Section 5. Furthermore, all state persistence
occurs on the AHE server, which substantially increases client mobility.

To ensure that the AHE client is easy to install, configure and use,
whilst providing maximum functionality, a number of design constraints were 
set on the overall design of the AHE: 

(1) We do not require the user to install Globus [15], Unicore [16] (or any
other grid middleware) clients on his/her machine, even if the grid on which
he/she wishes to run applications uses such grid middleware.

(2) We assume that the client device uses NAT (Network Address Translation
[23]) and that the device is firewalled to only allow outgoing connections.
This effectively means that the client does not accept any inbound
connections: all communication is outbound and initiated by the client.

(3) For maximum portability, we require that the client supports only HTTP
[24], HTTPS [25] and SOAP [26].

(4) The client does not have to be installed on a single machine; the user can 
move between clients on different machines and access the applications 
that have been launched from any one of them. The user can even employ 
a combination of different clients - for example, a command line client to 
launch an application and a GUI client to monitor it. The client therefore 
must maintain no information about an application instance's state. All 
state information is maintained as a central service on the AHE server 
that is queried by the AHE client. 

(5) The client machine needs to be able to upload input files to and download 
output files from a grid resource, but we assume it does not have GridFTP 
client software installed. A WebDAV-based intermediate file staging area 
is therefore used to perform file-staging between the client and the target 
grid resource. 

(6) The AHE client maintains no knowledge of the location of the application 
on the target grid resource, or how it should be run, and it maintains no 
information on specific environment variables that need to be set. 

(7) The client should not be affected by changes to a remote grid resource, 
for example if its underlying middleware changes to a newer version of 
the Globus Toolkit [15,27].

(8) All communication is secured using Transport Layer Security (TLS) [28]
with the user's X.509 certificate used for encryption, authentication 
and authorization. 
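
Constraint (4) can be illustrated with a minimal Python sketch. It is not the
AHE implementation (the AHE server is written in Perl and the clients in Java):
all class and method names below are hypothetical, and an in-memory dictionary
stands in for the server-side state store that a real client would reach over
HTTPS/SOAP. The point is only that two clients holding no instance state of
their own see the same application instances.

```python
import uuid


class StateServer:
    """Hypothetical stand-in for the central state store on the server."""

    def __init__(self):
        self._instances = {}

    def prepare(self, owner, application):
        # The server, not the client, creates and remembers the instance.
        instance_id = str(uuid.uuid4())
        self._instances[instance_id] = {
            "owner": owner, "application": application, "status": "PREPARED"}
        return instance_id

    def set_status(self, instance_id, status):
        self._instances[instance_id]["status"] = status

    def list_instances(self, owner):
        return {iid: rec["status"] for iid, rec in self._instances.items()
                if rec["owner"] == owner}


class Client:
    """Holds only a user identity and a server handle, never instance state."""

    def __init__(self, server, owner):
        self.server = server
        self.owner = owner

    def launch(self, application):
        instance_id = self.server.prepare(self.owner, application)
        self.server.set_status(instance_id, "RUNNING")
        return instance_id

    def my_instances(self):
        # Always asks the server; nothing is cached on the client side.
        return self.server.list_instances(self.owner)


server = StateServer()
cli_client = Client(server, "alice")   # e.g. a command-line client on machine A
gui_client = Client(server, "alice")   # e.g. a GUI client on machine B
launched = cli_client.launch("namd")
seen_elsewhere = gui_client.my_instances()
```

Because the second client reconstructs its view purely from the server, the
instance launched by the first client is visible to it without any shared
client-side files.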

These design constraints have led us to an architecture in which the AHE 
client is extremely lightweight and simple to deploy, with no dependencies 
except for the Java Runtime Environment [30]. It should be noted that this 
design does not remove the need for middleware solutions such as Globus 
or Unicore on the grid resource (which may be, for example, a US TeraGrid 
site or a supercomputer on the DEISA grid); indeed, we provide an interface 
to run applications on multiple resources with different underlying grid 
middlewares, so it is essential that the grid resource provides a supported 
middleware installation on their machines. The AHE uses GridSAM [31], a
job submission and monitoring tool, as its interface to the grid resources.
GridSAM provides the AHE server with a uniform interface to the different
types of grid middleware installed on a grid. See Section B and Figure B.1
for a brief discussion of GridSAM's role in the AHE architecture. The AHE
removes the requirement to install any other middleware on the user's client 
machine: the user simply needs to install the lightweight AHE client to 
interact with the grid resources which may have heavyweight middleware 
installed on them. The AHE client currently provides a uniform interface to 
grid resources with installations of Globus Toolkit 2.4.3 [15], Grid Engine 
6.0 u4+ [32], Condor v6.4, 6.6 and 6.7 [33] and Unicore [16], while work is
in progress to enable interfacing to Globus Toolkit 4.0 [27] based grids. The
recently developed Unicore support for AHE fulfils demand for the capability
to launch applications on the EU DEISA grid [8] and brings with it the much
sought after interoperability between Unicore-based grids and Globus-based 
grids. 

In addition to the state information about all the application instances, the 
AHE server also stores all the necessary information about how an application 
should be run on the various computational grid resources and provides a 
single, uniform interface to the AHE client for running that application across 
all possible grid resources. This is particularly useful for applications that 
run on supercomputers which often have unique deployment scenarios and 
require special runtime environments. Storing this deployment/configuration
information on the AHE server as a central application-specific service is an 
efficient mechanism for making the information available and useful to a large 
number of users at once. 

The AHE design is based around the notion that very often a group of 
researchers will want to run the same application, but not all of them will 
possess the skill or inclination to install the application on a remote set of 
grid resources. In the AHE, we therefore distinguish between experts and end- 
users. An expert user installs the application and configures the AHE server, 
so that all participating users can share the same application on the grid at 
large. The end-user simply needs to download and install the lightweight AHE 
client and is then able to trivially access such 'centrally installed' applications. 

The design of the AHE is novel in its application-centric approach,
according to which we treat the computational 'application' as a higher level
entity than a computational 'job'. Computational steering applications
[34,35], coupled model simulations [36,37] and workflows [38,39,40] are cases
where applications and jobs can be clearly distinguished. In the initial AHE
release, applications and jobs stand mainly in a one-to-one relationship except
for the case of workflow applications. Complex workflow applications have
been deployed on grids using the AHE wherein the workflows are composed
of simpler computational jobs. This is discussed in more detail in Section 5.
By providing a service that will launch a particular application rather than a
generic computational job, it is possible to reduce the complexity of the client
and make the scientist's life easier. The scientist can then concentrate on
science rather than spending time understanding and installing middleware
and managing individual jobs.
4 Application Hosting



An application is said to be grid-enabled when it is able to run on multiple 
heterogeneous resources comprising a grid. In the computational science 
domain, the AHE provides a mechanism to expose an unmodified application 
as a standards-compliant web service. Appendices A and B provide more
technical detail about how this is done. 

When an application is launched on a grid resource using the AHE, 
that instance of the application is referred to as the application instance. In 
other words, the user can launch many application instances for an application 
which is hosted in the AHE. Any kind of modelling and simulation application 
can be hosted within the AHE. To the best of our knowledge, all of the
applications currently hosted in the AHE are simulation codes, as discussed
in Section 4.2. We therefore use the term simulation instead of application
instance in much of the following discussion of the AHE. However, the same 
discussion is applicable, more generally, to instances of any other type of 
application hosted in the AHE. 

We currently host a number of scientific applications within the AHE 
including the highly scalable molecular dynamics (MD) codes DL_POLY
3.01 [41], NAMD [42], LAMMPS [43] and GROMACS [44] and the
lattice-Boltzmann (LB) code, LB3D [3,45,46]. Scientists can use the AHE client
to launch these applications on a grid; in particular, we currently use the 
AHE to run these applications on the UK NGS and US TeraGrid resources. 
The AHE does not require the application to be modified as long as the 
application is reasonably well-behaved with respect to the run parameter 
specification, file-naming and file-location conventions. 

We next describe how to run such applications on a grid using the
AHE. Most of the steps involved are common to all applications. Then we 
discuss some of the specific applications currently hosted in the AHE. Lastly, 
we describe how the AHE client and server can be configured to host a new 
application. 

4.1 Running hosted applications 

The steps involved in running an application instance on a grid using the 
AHE are listed below. We present the steps in terms of running application 
instances of simulation type applications. The AHE GUI client is implemented 
in the familiar 'wizard' fashion; each step in the launching process is presented 
to the user as a separate screen in the GUI with controls to navigate between 
screens:

(1) In the AHE client wizard, the user first specifies the particular application 
that he wishes to run, e.g., DL_POLY, and other constraints such as the 
number of processors on which to run it, maximum wall time required 
and so on. The AHE server returns a list of grid resources on which the 
application is installed and that match the constraints. The user then 
selects the grid resource on which he wants to run the simulation. 

(2) Next, the AHE client wizard prompts the user to specify the location of
the input files and the names of the output files that will be produced at
the end of the simulation. In the most general case, the user can manually
specify the locations of input files and the names of output files. However,
the AHE client has a plug-in parser feature, whereby application-specific
plug-in parsers can be integrated into the client, automating the file
management operations. This is discussed in more detail in Sections 4.2 and 4.3.

(3) The user then uses the AHE client wizard to start the simulation on the 
grid resource. 

(4) Once the simulation has started, the user can manually check the
simulation status, set the AHE client to poll the status of
the simulation, or shut down the client and return at a later time to
retrieve the simulation status. The user can use any machine with an AHE
client installation, not necessarily the one from which the simulation was
started, to monitor his/her simulations. This is ideal in situations where
the user would like to access the simulation state, including the input and
output files, from different machines.

(5) Apart from monitoring the status of the simulation, users can also
terminate their simulation before normal completion using the AHE client.

(6) Finally, when the simulation has completed, the user can retrieve all the
input and output files from the remote grid resource at the click of a button.
The task of recovering output files scattered around a (global) grid has
been very tedious until now and this feature of the AHE has greatly
enhanced the productivity of grid users.

(7) The user may then destroy all memory of the simulation on the AHE 
server or can allow the simulation state to persist on the AHE server to 
review it in the future. Note that a review may involve retrieving the 
simulation input and output files at a later time. 

The combination of the capability to parse configuration files in order to
discover input and output files, to automatically stage the files to and from the
grid resources, and to review the state of the simulation, including
associated files, long after the simulation has finished makes the AHE a
powerful tool for addressing the provenance problem,
especially when one wishes to run a large number of application instances
distributed across a grid.
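
The life cycle implied by steps (1) to (7) can be summarised as a small state
machine. The following Python sketch is only an illustration under our own
assumptions: the status names and operation names are hypothetical and are not
taken from the AHE interface, and file retrieval is omitted because it does not
change the status of an instance.

```python
from enum import Enum


class Status(Enum):
    PREPARED = "prepared"      # resource chosen, files specified
    RUNNING = "running"
    FINISHED = "finished"      # normal completion
    TERMINATED = "terminated"  # stopped early by the user
    DESTROYED = "destroyed"    # server-side record removed


# Allowed transitions, keyed by a (hypothetical) operation name.
TRANSITIONS = {
    "start": {Status.PREPARED: Status.RUNNING},
    "complete": {Status.RUNNING: Status.FINISHED},
    "terminate": {Status.RUNNING: Status.TERMINATED},
    "destroy": {Status.FINISHED: Status.DESTROYED,
                Status.TERMINATED: Status.DESTROYED},
}


def next_status(status, operation):
    """Return the new status, or raise if the operation is not allowed."""
    try:
        return TRANSITIONS[operation][status]
    except KeyError:
        raise ValueError(f"cannot {operation} from {status.name}") from None


def run_operations(operations, status=Status.PREPARED):
    """Apply operations in order and return every status passed through."""
    history = [status]
    for operation in operations:
        status = next_status(status, operation)
        history.append(status)
    return history


normal_run = run_operations(["start", "complete", "destroy"])
aborted_run = run_operations(["start", "terminate"])
```

Leaving out the final "destroy" operation corresponds to step (7), where the
user allows the simulation state to persist on the server for later review.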
4.2 Example applications



DL_POLY 

DL_POLY [41] is a parallel molecular dynamics package for simulations of
macromolecules, polymers, ionic systems and solutions. We host DL_POLY
within the AHE for users who wish to run molecular dynamics (MD)
simulations with DL_POLY on the UK NGS core nodes and the HPCx facility
[47]. This makes it trivial to launch DL_POLY simulations on these grid
resources using the AHE client wizard.

A DL_POLY specific plug-in parser has been integrated into the AHE
client, so that the user simply needs to specify the location of the CONFIG
file on his/her local machine and all the input files, such as CONTROL,
FIELD and/or TABLE, are automatically staged to the remote grid
resource at the start of the simulation.

Once the simulation has terminated, the user can, at the click of a
button, retrieve all the input and output files from his/her DL_POLY
simulations (for example, the OUTPUT, REVCON, REVIVE and STATIS files)
to the local machine.

NAMD and LAMMPS 

We host NAMD and LAMMPS within the AHE for users who wish to run
MD simulations with these applications on the UK NGS core nodes and US
TeraGrid sites. NAMD [42] is a parallel molecular dynamics code primarily
used for large-scale bio-molecular simulations. LAMMPS [43] is a parallel
molecular dynamics code with optimizations for long-range interactions. It
is trivial to launch NAMD and LAMMPS simulations using the AHE client 
wizard by following the steps described previously. 

Plug-in parsers for NAMD and LAMMPS have already been integrated
into the AHE client and are supplied with the AHE client download. These
plug-in parsers allow the user to specify the location of the NAMD/LAMMPS
configuration file from which the AHE automatically discovers and transfers 
all the input and output files that the application instance will consume and 
produce. 

For example, the NAMD/LAMMPS configuration file along with the 
force-field parameter file, co-ordinate file, velocity file, and restart files that
may be required to run the simulation are automatically staged to the grid 
resource at the start of the simulation. Since the simulation is run within a 
single working directory on the remote grid resource, any relative paths in the 
configuration files are removed. All input and output files that belong to a 
particular simulation are located in a uniquely associated working directory. 
Once the simulation has finished the user can again, at the click of a button,
retrieve all the output files from his/her simulation working directory on the 
remote grid resource. 
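
To make the role of such a plug-in parser concrete, the following Python
sketch scans a keyword/value configuration file, collects the files to stage
in and the output names to expect, and rewrites every path to a bare file
name so that the run can take place in a single working directory. It is not
the AHE parser (the AHE clients are written in Java): the function name, the
two keyword sets and the sample file are assumptions chosen for illustration,
loosely modelled on NAMD-style input.

```python
import os
import shlex

# Illustrative keyword sets; a real plug-in would cover the full
# configuration syntax of the hosted application.
INPUT_KEYS = {"structure", "coordinates", "parameters", "velocities"}
OUTPUT_KEYS = {"outputname", "restartname"}


def parse_config(text):
    """Return (inputs, outputs, rewritten_text) for a keyword/value config."""
    inputs, outputs, rewritten = [], [], []
    for line in text.splitlines():
        tokens = shlex.split(line, comments=True)  # drops '#' comments
        if len(tokens) >= 2 and tokens[0].lower() in INPUT_KEYS | OUTPUT_KEYS:
            key, path = tokens[0], tokens[1]
            if key.lower() in INPUT_KEYS:
                inputs.append(path)  # local path, to be staged in
            else:
                outputs.append(os.path.basename(path))
            # Everything runs in one remote working directory, so only
            # the bare file name is kept in the staged configuration.
            line = f"{key} {os.path.basename(path)}"
        rewritten.append(line)
    return inputs, outputs, "\n".join(rewritten)


sample = """\
# hypothetical NAMD-style input
structure     ../setup/protein.psf
coordinates   ../setup/protein.pdb
parameters    /home/user/ff/par_all27.prm
temperature   300
outputname    run1
"""
inputs, outputs, staged = parse_config(sample)
```

Lines that do not name a file, such as the temperature setting, pass through
unchanged, so the staged configuration differs from the original only in its
paths.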

GROMACS 

GROMACS [44] is a highly optimized, parallel molecular dynamics simulation
package extensively used for biomolecular simulations. We host GROMACS 
within the AHE for users who wish to run MD simulations with GROMACS 
on the UK NGS core nodes. A client parser plug-in makes it trivial to launch 
the application using either the AHE GUI or command-line clients. 

GROMACS differs from most of the other applications currently hosted in the 
AHE in that it consists of two separate applications, namely a preprocessor 
(grompp) and the actual simulation code (mdrun). A Perl script has been
created which runs on the grid resource and directs the output of the
preprocessor to the input of the simulation code. This script is then hosted in the
AHE and treated as a single application. A client configuration parser has also 
been created which takes its input from a 'meta' configuration file, allowing 
the user to specify parameters for both components of the application. 

Once the simulation is complete, the AHE client stages back both the output from the
simulation code and certain important intermediate files created by the preprocessor.
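
The wrapper described above is a Perl script; purely as an illustration, the
same two-stage idea can be sketched in Python. The file names are
hypothetical, the option letters are assumed to follow the usual grompp and
mdrun command-line conventions, and the real script may differ in every
detail. A recording function is passed in place of subprocess.run so that the
sketch can be run without GROMACS installed.

```python
import subprocess


def run_pipeline(mdp, structure, topology, run=subprocess.run):
    """Run the preprocessor, then feed its output to the simulation code."""
    run_input = "topol.tpr"  # assumed name of the file linking the two stages
    stages = [
        ["grompp", "-f", mdp, "-c", structure, "-p", topology, "-o", run_input],
        ["mdrun", "-s", run_input],
    ]
    for command in stages:
        # check=True aborts the chain if a stage exits with an error,
        # so the simulation never starts from a failed preprocessing step.
        run(command, check=True)
    return stages


# Dry run: record the commands instead of executing them.
recorded = []
run_pipeline("md.mdp", "conf.gro", "topol.top",
             run=lambda command, check: recorded.append(command))
```

Hosting the wrapper rather than the two programs separately keeps the
one-to-one relationship between an application instance and a job, at the
cost of hiding the intermediate step from the server.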

LB3D 

LB3D [46] is a massively parallel implementation of the lattice-Boltzmann
model for amphiphilic fluid dynamics [3,45] which is able to reproduce the
morphological and rheological phenomena observed in ternary amphiphilic
mixtures from purely bottom-up mesoscopic interactions. We have hosted LB3D
within the AHE to run lattice-Boltzmann simulations on the UK NGS and US
TeraGrid nodes. In this case, it was necessary for us to modify the application,
as some features of the code were incompatible with the AHE design. LB3D
produces output files whose names contain a randomly generated string while
relying also on a pre-existing output directory at the start of the simulation.
For an application to be easily hosted within the AHE, it is preferable to know in
advance the names of the output files that the code will generate during
execution so as to automate the process of output file retrieval. Moreover, in
the AHE design, each independent execution of an application is associated 
with a unique working directory on the remote grid resource, within which 
the simulation is run and the staging and de-staging of input and output files 
occurs. It is, therefore, desirable for the application code to run within a single 
working directory. In grid environments, where the multiplicity of
resources means that one needs to keep track of output dispersed across various remote
grid resources, it is particularly useful to have working directories associated
with specific simulations.
4.3 Hosting a new application

To host an application within the AHE, the expert-user needs to configure the 
AHE server with the following application-specific information: 

(1) location of the application executable on the grid resources, e.g., NGS 
nodes and TeraGrid sites. 

(2) any environment variables that may be required to run the application 
on the different grid resources such as the path to dynamically linked 
libraries.

The reader should consult the AHE server installation guide [48] for the exact
location where this information needs to be stored. After configuring the 
AHE server with the above information, running a short script provided with 
the server distribution updates the registry of hosted applications to include 
the new application. There is no need to restart the AHE server once a new 
application has been added in this manner. 
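
The kind of information the expert user supplies can be pictured as a
registry keyed by application and grid resource. The Python sketch below is a
hypothetical data model, not the AHE server's storage format (which is
documented in the installation guide): the class names, resource names and
paths are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Deployment:
    executable: str                                   # path on the grid resource
    environment: dict = field(default_factory=dict)   # variables to set


class AppRegistry:
    """Maps application name -> grid resource -> deployment details."""

    def __init__(self):
        self._apps = {}

    def register(self, app, resource, deployment):
        # New entries become visible immediately; nothing is restarted.
        self._apps.setdefault(app, {})[resource] = deployment

    def resources_for(self, app):
        return sorted(self._apps.get(app, {}))

    def launch_spec(self, app, resource):
        d = self._apps[app][resource]
        return {"executable": d.executable,
                "environment": dict(d.environment)}


registry = AppRegistry()
registry.register("myapp", "resource-a",
                  Deployment("/opt/myapp/bin/myapp",
                             {"LD_LIBRARY_PATH": "/opt/myapp/lib"}))
registry.register("myapp", "resource-b", Deployment("/usr/local/bin/myapp"))
spec = registry.launch_spec("myapp", "resource-a")
```

Keeping these details on the server is what allows the client to remain
ignorant of executable locations and environment variables.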

Once the above changes have been made to the server-side, the user is 
able to run the application using the AHE client in the generic mode, i.e. 
the AHE client does not require any modification when a new application 
is hosted. However, it is possible to write an application-specific plug-in 
parser for the AHE client which can parse the user-specified application 
configuration file to automatically discover the input and output files for the 
particular application instance that the user wishes to launch. The parsing 
capability allows the user to simply specify the location of the configuration 
file; the AHE client then automatically parses it to find out all the input 
file locations and output file names. Thus, when the simulation is started 
the input files are automatically staged to the remote grid resource and the 
output files are retrieved at the end of the simulation. Details of writing and 
integrating such a plug-in parser with the AHE client are specified in the 
AHE client user guide [49].



5 Scientific Workflows 

In addition to the GUI, the AHE client includes a closed set of atomic
command-line tools that replicate the essential operations in the AHE GUI
client. For example, there are AHE command-line tools to 'prepare' an
application instance for running on a grid resource, to 'start', 'monitor', 'terminate'
and 'getoutput (files)'. Complex workflows composed of multiple simulations
and/or calculations can be realised very simply by writing scripts that
combine the command-line tool functionality. Here, we present three examples of
how the AHE command-line tools can be combined to compose and deploy
complex workflows.

5.1 Ensembles of simulations

Fig. 1. Running ensembles of application instances on a grid using the AHE.
Application instances labelled as sim1, sim2, ..., simN are started on computational
resources on the grid using the AHE. The AHE manages input/output file-staging
for each application instance to/from the associated grid resource. The user is able
to monitor the status of all the application instances.

In the workflow depicted in Figure 1, the user wishes to launch a large
number of independent simulations on the grid and retrieve the results 
for post-processing. There may be an additional requirement for these 
simulations to run concurrently. Despite their multiplicity, each simulation 
may be computationally expensive. For high-end applications, the capabilities 
needed can only be delivered by resources available on grids such as the 
UK NGS and US TeraGrid. Computational resources on the grid provide a 
flexible, low cost alternative to intra-institutional supercomputing resources 
for running such ensembles of simulations. Provided sufficient computational 
resources are available on the grid, deployment of the ensemble workflow via
the AHE results in shorter turn-around time and greatly enhanced control of 
provenance for the simulation output data as compared to submitting each 
simulation individually on a local computational resource. 

The ensemble functionality is currently being employed to run an
ensemble of molecular dynamics simulations of the HIV-1 protease and
inhibitors [38], each with a different starting configuration for the system,
in order to obtain better statistics. Within the ensemble of simulations, the 
initial conditions are identical for the simulations except for the initial atomic 
velocity distributions, sampled from a Maxwell-Boltzmann distribution and 
randomised differently for the same system temperature. 
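
As an illustration of how ensemble members can share a temperature yet differ
in their initial velocities, the Python sketch below draws each Cartesian
velocity component from a Gaussian with zero mean and variance k_B T / m,
which is the Maxwell-Boltzmann distribution for a single component. The
function name, the two example masses and the use of an integer seed per
ensemble member are our assumptions; MD packages perform this step
internally and typically also remove the net momentum, which is omitted here.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann constant in J/K


def initial_velocities(masses_kg, temperature_k, seed):
    """Draw one velocity vector (m/s) per atom for the given temperature."""
    rng = random.Random(seed)  # a separate generator per ensemble member
    velocities = []
    for mass in masses_kg:
        sigma = math.sqrt(K_B * temperature_k / mass)  # per-component std. dev.
        velocities.append(tuple(rng.gauss(0.0, sigma) for _ in range(3)))
    return velocities


masses = [1.6735e-27, 2.6567e-26]  # roughly H and O, for illustration only
ensemble = {seed: initial_velocities(masses, 300.0, seed) for seed in range(4)}
```

Recording the seed alongside each simulation makes every ensemble member
reproducible, which fits the provenance role of the server-side history.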

Such ensembles of MD simulations are useful for thermodynamic
integration (TI) calculations where multiple MD simulations need to be launched
at different values of the dual topology parameter λ, or where multiple short
trajectory simulations can be launched on a grid to give insight into the
thermodynamics of a single long trajectory simulation [50]. An alternative
way to perform TI calculations via chained simulations is discussed in Section 5.2.

Thus the AHE provides a simple way to write scripts in order to start 
simulations on multiple grid resources. The user needs only to specify the 
configuration file for each simulation; the AHE automatically discovers and 
stages input/output files to/from the remote grid resource without the
user having to install any complicated grid middleware on his/her local 
machine. This feature is particularly useful when there are a large number 
of simulations to be launched on several resources which span multiple 
administrative domains. 
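The kind of launch script described above might look like the following Python sketch. It does not use the real AHE client; the launch callable, the configuration file names and the resource names are hypothetical stand-ins, and the round-robin assignment of simulations to resources is only one possible policy.

```python
from itertools import cycle


def plan_ensemble(config_files, resources):
    """Pair each simulation configuration file with a grid resource, round-robin."""
    if not resources:
        raise ValueError("at least one grid resource is required")
    return list(zip(config_files, cycle(resources)))


def launch_ensemble(config_files, resources, launch):
    """Call launch(config, resource) for every pairing and collect the returned handles."""
    return [launch(cfg, res) for cfg, res in plan_ensemble(config_files, resources)]


def fake_launch(config, resource):
    """Stand-in for a real client call: records what would have been started."""
    return f"{config}@{resource}"


configs = [f"sim{i}.conf" for i in range(1, 6)]
handles = launch_ensemble(configs, ["resource-a", "resource-b"], fake_launch)
```

Passing the launcher in as a function keeps the planning logic independent of whichever client command actually starts a job.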

The AHE server maintains a history of all the simulations that have 
been launched by each user and allows for their monitoring and review in the 
future. As noted previously, the user has the flexibility to change between 
client machines while still having a reference to his/her entire simulation 
history. 

5.2 Chained simulations 

In the workflow in Figure 2, the user wishes to launch a sequence of chained
simulations where one application instance begins execution after the previous 
instance has finished and may depend on output from the previous instance's 
execution. 

Fig. 2. Running a chain of application instances on a grid using the AHE. Application
instances labelled as sim1, sim2, ..., simN are started in sequence on computational
resources on the grid using the AHE. The AHE manages input/output
file-staging to/from the associated grid resources, piping the output from one application
instance as input for the next application instance in the chain. The user is
able to monitor the status of each application instance as the workflow progresses.

Such a workflow of chained simulations is currently being used by numerous
AHE users. In particular, the chained simulation capability is being
used to study binding affinities of wild-type and mutant HIV-1 proteases 
with drug inhibitors [39]. In order to obtain a starting structure for the MD 
production runs, the initial artificially mutated HIV-1 protease wildtype 
crystal structure is equilibrated by subjecting it to a chain of equilibration 
simulations, each simulation corresponding to a step in the equilibration 
procedure. This is illustrated in Figure 3.

Fig. 3. Running a chain of MD simulations on a grid; each simulation corresponds
to a different step in the initial relaxation of the system to equilibrium before
production runs can be performed.

The AHE is used to automate the
entire process of launching and managing the equilibration simulations in
sequence on the grid, thus realising the workflow in Figure 3. The automation
provided by AHE includes starting each simulation on different grid resources 
in sequence, transferring input files to the grid resource at the start and 
retrieval of output files at the end of each simulation. This feature is particu- 
larly valuable as it frees the user from the mundane task of keeping track of 
all the simulation data files and manually transferring them across the grid 
resources for the next simulation in the chain.

Fig. 4. Thermodynamic integration calculation consisting of a chain of simulations
at different values of parameter λ is an ideal candidate for grid deployment. See
text in Fowler et al. [34] for more details.

Thermodynamic integration (TI) calculations [Mll^, where each simulation is run at a particular value of
the dual topology parameter λ (0 < λ < 1), such that the initial configuration
for a simulation at λ_{n+1} is obtained from a simulation at λ_n, are another
example of a workflow of chained simulations that can be deployed via
the AHE. This is illustrated in Figure 4. As noted in Section 5.1, TI-type
calculations can also be studied using an ensemble workflow. Additionally,
one can imagine a chained workflow as in Figure 2 comprising application
instances belonging to different application types.
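The chained workflow can be sketched in Python as a loop in which the output of one step becomes the input of the next. This is an illustration only: run_step stands in for whatever actually launches a simulation and retrieves its output, and the λ values are hypothetical.

```python
def run_chain(lambdas, run_step, initial_config):
    """Run one simulation per lambda value, feeding each output into the next step."""
    config = initial_config
    history = []
    for lam in lambdas:
        # The configuration produced at lambda_n seeds the run at lambda_{n+1}.
        config = run_step(lam, config)
        history.append((lam, config))
    return history


def fake_step(lam, config):
    """Stand-in for a real simulation: tags the configuration with the lambda value."""
    return f"{config}|lambda={lam:.1f}"


lambdas = [0.1, 0.3, 0.5, 0.7, 0.9]  # hypothetical schedule inside (0, 1)
history = run_chain(lambdas, fake_step, "start")
```

Because each step depends on the previous one, the loop is strictly sequential, in contrast to the ensemble workflow where all members can run concurrently.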

5.3 Concurrent simulations 

Fig. 5. Running coupled application instances on a grid using the AHE. Each application
instance is executed on a separate grid resource via the AHE. The AHE
can monitor the individual components as they execute.

In the workflow in Figure 5, the user wishes to launch concurrent and
dependent simulations on multiple grid resources. The simulations may
need to communicate with each other periodically during their execution.
This type of workflow can be used to perform coupled model simulations
[37] where the system being simulated spans multiple time-scales and/or
length-scales. In order to accurately and efficiently simulate such systems,
the different domains of the system are simulated at different levels of detail 
using distinct applications which communicate at regular intervals. See De 
Fabritiis et al. [36] for a description of such a multiscale computational 
technique that has successfully coupled molecular dynamics with Landau's 
fluctuating hydrodynamics to simulate sound waves propagating in bulk 
water and reflected by a lipid monolayer. Although such coupled models have 
not been deployed with the AHE yet, the AHE provides the infrastructure to 
start up the different applications on the grid, and once communication has 
been established and concurrent execution has begun, the AHE will enable 
the user to monitor individual applications and/or to terminate them, while 
at the same time furnishing a high-level overview of the entire coupled model 
application. 

At the time of writing, there is a significant AHE user base, with others
planning to use it. Favourable experiences have been reported for NAMD
and LAMMPS applications hosted within the AHE as compared to alternative
strategies p^l8ll35j for running jobs on grids [38]. As expected from the
design criteria, the key benefits for users have been:

(i) minimal installation, configuration and maintenance effort on the client-side;

(ii) flexibility of restarting or switching between clients so that the client
need not be 'attached' to the grid in order to launch or manage jobs;

(iii) automatic staging in of input files to a grid resource, third party file
transfer between resources, and retrieval of distributed files to the local
machine on job completion;

(iv) no need to repeatedly submit individual jobs to many resources and
monitor status across multiple grid resources.

Users have expressed the need for the capability to monitor queues on differ- 
ent grid resources. This capability can be provided by the AHE in future if 
the queue status information is published via a web service by the resource 
managers installed on the grid resources. 



6 Future Developments 

Future extensions of the AHE aim to add functionality that enables even 
more ambitious computational science projects to be undertaken on grids 
with ease. Globus Toolkit 4 (GT4) [27] and Unicore [16] are able to process
jobs specified using the JSDL schema and we are exploring the possibility of
future versions of AHE being able to directly submit jobs to Globus Toolkit
4 (GT4) and Unicore without the need for GridSAM (see Appendix B) as
a third party tool to provide the web service interface for job submission.
Work is underway to integrate the WSRF-compliant RealityGrid steering
framework into the AHE [35], which will provide support for starting and
managing steerable applications and coupled model applications hosted in
the AHE. We plan to extend the current workflow functionality to orchestrate
even more complex workflows, using the industry-standard BPEL language
[51]. The scientist will be provided with graphical tools to interface with the
BPEL workflow engine and easily implement their desired BPEL workflow 
operations. 

We are now looking at new platforms on which the lightweight AHE 
client can be deployed. For example, work is underway on an AHE client 
that runs on mobile phones and PDAs [52] allowing scientists to launch 
and monitor simulations on the move. Co-allocation is one of the major 
challenges faced by computational scientists trying to perform cross-site 
runs using multiple concurrently available grid resources [13]. No automated
advance reservation and co-scheduling systems are in place yet within
so-called production grids; performing cross-site runs [H] today requires
human intervention for booking resources in advance and making sure these
are indeed available at the desired time*. A fault-tolerant web services
compliant approach to co-allocation called HARC [55] has recently been
proposed and implemented. The attractiveness of this scheduler stems from
its ability to co-allocate (i) compute resources and (ii) lambda networks [56],
which are dynamically provisioned optical networks for bandwidth intensive
applications. An interface to the HARC system will be developed in the
AHE client to provide support for automated, fault-tolerant co-allocation.
We hope to provide support for MPICH-G2 enabled applications in the
future. Finally, we are also looking into the use of the AHE for accessing
campus-based resources, often referred to as 'campus grids'. Ultimately, AHE
could become a uniform interface to all computational resources.

* Note that the beta version of the NAREGI [9] software stack has a super-scheduler
at the heart of its architecture [53] and claims to implement WS-Agreement [54]
compliant co-allocation.



7 Conclusions 



Our goal in this work has been and continues to be to provide scientists with 
tools such as command-line interfaces that are simple and familiar, while at the 
same time furnishing access to a much more powerful set of resources on grids 
than previously available, allowing scientists to use their own creativity to 
achieve increasingly complex computational objectives. The AHE allows com- 
putational scientists to overcome the barrier of heavyweight, difficult-to-use 
grid middleware and thereby to take advantage of grid computing with much 
greater ease than has been possible hitherto. Grid computing has contributed 
significantly to the scientific programme within isolated, grid-centric computa- 
tional science projects Pll2|13|58|l55] but it still remains under-utilised among 
wider national and international computational science communities. The in- 
volvement of increasing numbers of computational scientists in grid activities 
is essential for determining best practices in grid scheduling policies, provi- 
sioning/orchestration policies and resource configuration. We hope that the 
AHE will help to realise these goals. 



8 Distribution and Installation 



The first release of the AHE was made available on 31 March 2006. The AHE
(Version 1.0.1) client and server software can be downloaded from the RealityGrid
website at http://www.realitygrid.org/AHE or the NeSCForge website at
http://forge.nesc.ac.uk/projects/ahe. Future releases will continue to be available
from the RealityGrid website. Documentation, including installation and
configuration of the AHE to run applications on the UK NGS and US TeraGrid,
is also available from these websites. Subscription to the AHE mailing list
is open at http://www.mailinglists.ucl.ac.uk/mailman/listinfo/ahe-discuss.

A version of the AHE software, Version 1.0.2, has also been developed
in which the AHE server can be hosted within the Apache Jakarta Tomcat
container [60]. This version of the AHE is distributed via the Open Middleware
Infrastructure Institute UK (OMII-UK) [61] within the latest OMII
3.2.0 server and client software which was released on 12 November 2006.
This version of the AHE conforms to the OMII Integration Specification and
is easily installable and integrated with the rest of the OMII distribution
including WSRF::Lite [62] and GridSAM [31] which are pre-requisites for
the AHE. OMII 3.2.0 server and client software is available for download at
http://www.omii.ac.uk.

A Web Services and the Application Hosting Environment

In the AHE, we have adopted the Service Oriented Architecture (SOA) ap- 
proach to providing applications as services on the grid. Web services are one 
way of realizing SOA. As mentioned in the main text, from a computational 
scientist's perspective a web service is simply any computational functional- 
ity accessible over a network as a service. The term 'service' implies that the 
user simply needs to know the description of the service in order to invoke 
it and does not need to know how the functionality of the service is actually 
implemented; this is particularly important on a heterogeneous grid. Hence, 
in order for the web service to be consumed by a client, it is of utmost im- 
portance that the web services have standards-compliant interfaces. In the 
AHE, we use WSRF::Lite [62], a Perl library that provides the framework for
writing web services. In this Service Oriented Architecture (SOA) approach,
the key concept is that of loosely coupled components, i.e., web services, that
interact via SOAP messages and whose interface is described by documents
conforming to the WSDL standard [22].

There are obvious benefits to the adoption of a standards-based, dynamic and 
flexible approach to running applications on heterogeneous grid resources and 
managing the state of the application instances in a uniform, integrated way. 
Such an approach allows flexible integration of multiple applications to accom- 
plish complex computational tasks on the grid, as we aspire to do with the 
AHE's workflow functionality. Also, the widespread adoption of the web ser- 
vices approach in industry and academia permits grid computing practitioners 
to take advantage of numerous existing web services and web services devel- 
opment tools that are independently being made available to the community 
and to take advantage of the significant standardization and inter-operability 
efforts which clearly also address significant issues in grid computing. 

In the AHE, an application instance is represented as a transient stateful web
service resource (WS-Resource) [21]. The WS-Resource properties associated
with the application instance include:

• the application instance's reference handle, or EndPoint Reference (EPR),

• the status of the application instance,

• a trivial name to refer to the simulation,

• the date and time when the application instance was started,

• the grid resource on which it is running,

• the names and URLs of the input files,

• the names and URLs of the output files.
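The properties listed above can be pictured as a simple record. The Python sketch below is only an illustration of that data structure; the class name, field names, status value and example URLs are assumptions and do not reproduce the AHE's actual schema, which is implemented in Perl with WSRF::Lite.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class AppInstanceProperties:
    """Illustrative record of the properties held for one application instance."""
    epr: str                 # EndPoint Reference used to address the instance
    status: str              # current status of the application instance
    name: str                # trivial name chosen by the user
    started: datetime        # date and time the instance was started
    grid_resource: str       # resource on which the instance is running
    input_files: dict = field(default_factory=dict)   # file name -> URL
    output_files: dict = field(default_factory=dict)  # file name -> URL


props = AppInstanceProperties(
    epr="https://ahe.example.org/services/namd/1234",  # hypothetical EPR
    status="RUNNING",                                   # hypothetical status name
    name="equilibration-step-1",
    started=datetime(2006, 11, 12, 9, 30),
    grid_resource="resource-a",
    input_files={"sim1.conf": "https://fsa.example.org/staging/sim1.conf"},
)
```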

Each time an application is run on the grid, a WS-Resource [21] is created on
the AHE server-side, and is used to represent that instance of the application's
execution. This WS-Resource provides an interface for the user to interact with
the application instance. The WS-Resource corresponding to the application
instance and its WS-Resource properties are stored on the AHE server in a
database referred to as the 'App Instance Registry'. This is described in more
detail in Appendix B. The WS-Resource properties can be queried at any
time using the AHE client or any other web services compliant client. In this
way the AHE provides a uniform interface for managing simulations of various
scientific application codes deployed on multiple grid resources.

The WS-Resource persists even after the application instance has finished exe- 
cuting, providing information on the location of any output files as well as the 
input files and configuration parameters used to initially run the application. 
This is a powerful provenance capability of the AHE, as the user can return 
at a later time and review the simulation by querying the properties of the 
associated WS-Resource.

B AHE Architecture




Fig. B.1. Architecture of the Application Hosting Environment; the numbers denote
operations performed between the AHE client, AHE server, File Staging Service,
File Store, MyProxy server and Grid Resources. See the text in Appendix B for a
detailed description of this diagram.

WSRF::Lite 

The AHE is developed with WSRF::Lite, a Perl implementation of the Web
Service Resource Framework (WSRF [21,63]) specification which has been
ratified by the OASIS [64] standards body. WSRF::Lite is the follow-on from
OGSI::Lite, the Perl implementation of the Open Grid Service Infrastructure
(OGSI [65]) specification from GGF [66]. It is built on SOAP::Lite [67], the
Perl module for web services from which it derives its name.

In the current release of the AHE, we host the WS-Resources on the server-side
within the Apache web server and store the WS-Resource properties in
a PostgreSQL [68] database. The PostgreSQL database storing the simulation's
state can be replicated and backed up non-locally to provide fault
tolerance.

GridSAM 

The AHE currently accesses computational resources on the grids via middleware
called GridSAM [31]. GridSAM provides a web services interface, running
in an OMII [61] web services container, for submitting and monitoring jobs on
grid resources with various Distributed Resource Managers (DRMs): Globus
2.4.3 [15], Grid Engine 6.0 u4+ [32] and Condor v6.4, v6.6 and v6.7.
GridSAM has a plug-in
architecture that allows connectors for different types of DRMs to be integrated
into it, thereby adding the functionality to submit and monitor jobs
on resources with those DRMs.

Computational jobs submitted to GridSAM are described using another
emerging standard, the Job Submission Description Language (JSDL [52]).
GridSAM consumes JSDL documents and performs the task of 'job' submission
to the target grid resource.
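To give a feel for what a JSDL document contains, the following Python sketch assembles a minimal one with the standard library. It is an illustration under assumptions: the executable path, argument and staging URL are hypothetical, the element selection follows the JSDL 1.0 schema as we understand it, and the documents actually generated by the AHE server may differ.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the JSDL 1.0 specification and its POSIX extension.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(executable, arguments, stage_in):
    """Return a JSDL job description as a string; stage_in maps file name to source URL."""
    ET.register_namespace("jsdl", JSDL)
    ET.register_namespace("jsdl-posix", POSIX)
    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")
    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = arg
    for file_name, uri in stage_in.items():
        # One DataStaging element per file copied to the resource before the run.
        staging = ET.SubElement(desc, f"{{{JSDL}}}DataStaging")
        ET.SubElement(staging, f"{{{JSDL}}}FileName").text = file_name
        ET.SubElement(staging, f"{{{JSDL}}}CreationFlag").text = "overwrite"
        source = ET.SubElement(staging, f"{{{JSDL}}}Source")
        ET.SubElement(source, f"{{{JSDL}}}URI").text = uri
    return ET.tostring(job, encoding="unicode")


doc = build_jsdl(
    "/usr/local/bin/namd2",                                    # hypothetical path
    ["sim1.conf"],
    {"sim1.conf": "https://fsa.example.org/staging/sim1.conf"},  # hypothetical URL
)
```

Output files would be described in the same way, with a Target element in place of Source.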

File Staging Area (FSA) 

The 'File Staging Area' (FSA) is designed to allow the client to use the WebDAV
[70] file transfer protocol to 'GET' (step 13 in Figure B.1) or 'PUT' (step
4 in Figure B.1) a file on the WebDAV server so that it can be downloaded
by GridSAM on to the target grid resource (step 10/12 in Figure B.1). This is
designed to handle the case where input files exist on the user's client machine 
and these need to be transferred to the grid resource. Similarly, the FSA is 
useful when files need to be retrieved from the remote grid resource to the lo- 
cal client machine. The FSA provides an intermediate file-staging area, albeit 
one which is accessible via normal HTTP GET and PUT mechanisms and is 
useful for pulling down the simulation input/output files from anywhere on 
the grid using nothing more than a web browser or potentially a PDA. We 
consider this type of file transfer as 'pass by value' since the client actually 
transports the file to the FSA. 
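Since the FSA is reachable through ordinary HTTP GET and PUT, the 'pass by value' transfers can be sketched with Python's standard library. The sketch only constructs the requests and does not send them; the FSA address and file name are hypothetical, and a real deployment would also need the client's credentials.

```python
import urllib.request


def put_request(fsa_url, file_name, payload):
    """Build (but do not send) an HTTP PUT that uploads payload to the staging area."""
    url = f"{fsa_url.rstrip('/')}/{file_name}"
    return urllib.request.Request(url, data=payload, method="PUT")


def get_request(fsa_url, file_name):
    """Build (but do not send) an HTTP GET that retrieves a staged file."""
    url = f"{fsa_url.rstrip('/')}/{file_name}"
    return urllib.request.Request(url, method="GET")


upload = put_request("https://fsa.example.org/staging/", "sim1.conf", b"steps 1000\n")
download = get_request("https://fsa.example.org/staging/", "sim1.out")
# urllib.request.urlopen(upload) would perform the transfer against a real server.
```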

FileStore

For much larger input/output files, which the user may not wish to download
to his/her local machine, the AHE supports third-party file transfer using
a 'FileStore' mechanism. The FileStore is any place on the grid where files
required by the instance of the application are available through GridFTP or
HTTP. The FileStore is used to hold large files like checkpoints that would
not normally be stored on the client machine. The AHE client passes the URL
[71] of the source input file or target output file to the AHE server and the
input files automatically get staged from the FileStore to the grid resource on
which the simulation is to be run or the output files from the grid resource
automatically get staged out to the desired target FileStore. These are steps
10 and 12 in Figure B.1. We call this 'pass by reference'. Note that a FileStore
may be a grid resource providing a storage service such as the data nodes on
the UK NGS.
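The distinction between 'pass by value' and 'pass by reference' can be expressed as a small decision rule. The Python sketch below is a hypothetical illustration, not the AHE client's logic: it treats anything given as a GridFTP (gsiftp) or HTTP URL as a reference to a FileStore, and anything else as a local file that must be uploaded through the FSA.

```python
from urllib.parse import urlparse

# URL schemes assumed to denote a FileStore location (GridFTP or HTTP).
REFERENCE_SCHEMES = {"gsiftp", "http"}


def staging_mode(location):
    """Return 'reference' for FileStore URLs and 'value' for local files."""
    scheme = urlparse(location).scheme.lower()
    return "reference" if scheme in REFERENCE_SCHEMES else "value"


modes = {
    loc: staging_mode(loc)
    for loc in (
        "/home/user/sim1.conf",                        # hypothetical local file
        "gsiftp://store.example.org/data/restart.chk", # hypothetical FileStore URL
    )
}
```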

MyProxy

Security is critical in any grid computing environment [1]. We chose not to
use WS-Security [72][73] for the AHE because message level security is not
required: we do not have any intermediaries for relaying SOAP messages. In
the AHE, communication is via HTTPS to provide Transport Layer Security
(TLS [28]); mutual authentication is used between the AHE client and the
AHE server using X.509 digital certificates [29]. For users of the UK NGS
[5], digital certificates or e-Science certificates can be obtained from the national
certification authority [71]. Within the AHE, the App WS-Resource is
given access to a proxy certificate [29] stored on a MyProxy [75] server. The
AHE server implements fine-grained authorization in that only the owner of 
the simulation has access to its properties. This can be extended to provide 
group access or open access analogous to the UNIX file system permissions 
for collaborative work. 
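The authorization rule, and its possible extension to group or open access, can be sketched as follows. This Python fragment is an assumption-laden illustration: the class and field names are invented here, and the default (no group, no open access) corresponds to the owner-only behaviour described above.

```python
from dataclasses import dataclass, field


@dataclass
class SimulationACL:
    """Hypothetical access rule for one simulation's properties."""
    owner: str
    group: set = field(default_factory=set)  # extra users granted access
    open_access: bool = False                # analogous to world-readable

    def can_read(self, user):
        """Owner always has access; others only via group or open access."""
        return user == self.owner or user in self.group or self.open_access


private = SimulationACL(owner="alice")
shared = SimulationACL(owner="alice", group={"bob"})
```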

App Server Registry 

The 'App Server Registry' maintains a registry of all the applications that are
hosted within the AHE, such as DL_POLY [11], NAMD [12], LAMMPS [33],
GROMACS [3^ and LB3D [H|45ll46j. The user can query the registry (step
1 in Figure B.1) to find the address for the service factory that provides a
particular application.

App Server Factory 

The 'App Server Factory' is a web service based on the Factory Pattern [76] 
which creates a new App WS-Resource on the AHE server each time the user 
invokes the Prepare operation. 

App WS-Resource 

The 'App WS-Resource' exists on the AHE server and is a WS-Resource rep- 
resenting a particular application instance. The user invokes the Prepare op- 
eration of the App Server Factory whenever an application instance needs to 
be launched on the grid. For each invocation of the Prepare operation, a new 
App WS-Resource is created on the AHE server and its WS-Resource proper- 
ties are initialized as per the application instance parameters specified by the 
user. 

App Instance Registry 

The AHE server maintains an 'App Instance Registry' which contains
the history of all application instances/WS-Resources that the user has
launched/created. The App Instance Registry is implemented by a PostgreSQL
[68] database. The user can query the App Instance Registry to get a
list of all the application instances launched and the associated WS-Resource 
properties. Collaborative analysis of simulation histories is greatly aided by 
the stored WS-Resource properties that can be accessed online via the App 
Instance Registry. 
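The interplay between the App Server Factory and the App Instance Registry can be illustrated with a short Python sketch. It is a simplified stand-in, not the AHE implementation: a dictionary replaces the PostgreSQL database, the EPR is just a generated URL, and the class name, status value and server address are hypothetical.

```python
import uuid


class AppServerFactory:
    """Creates one resource record per Prepare call (Factory Pattern)."""

    def __init__(self, application, base_url, instance_registry):
        self.application = application
        self.base_url = base_url.rstrip("/")
        self.instance_registry = instance_registry  # stands in for the App Instance Registry

    def prepare(self, owner, name):
        """Create a new resource record and return its reference handle."""
        epr = f"{self.base_url}/{self.application}/{uuid.uuid4()}"
        self.instance_registry[epr] = {
            "owner": owner,
            "name": name,
            "application": self.application,
            "status": "PREPARED",  # hypothetical status name
        }
        return epr


def history_for(instance_registry, owner):
    """List the reference handles of every instance created by one user."""
    return [epr for epr, props in instance_registry.items() if props["owner"] == owner]


registry = {}
factory = AppServerFactory("namd", "https://ahe.example.org/services", registry)
first = factory.prepare("alice", "equilibration-step-1")
second = factory.prepare("alice", "equilibration-step-2")
```

Because the registry lives on the server side, the same history is available from whichever client machine the user queries it.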

We now briefly describe the numbered operations in Figure B.1.

(1) User retrieves a list of applications hosted within the AHE from the App
Server Registry. 

(2) User invokes the Prepare operation on the App Server Factory.

(3) As a result of the invocation of the Prepare operation, the App Server
Factory creates a new App WS-Resource representing the instance of the
application. The App Server Factory returns a WS-Addressing [77] EndPoint
Reference (EPR) to the client which the client uses to communicate
with the App WS-Resource.

(4) The AHE client automatically transfers the user's input files to the File 
Staging Area. 

(5) The user uploads his/her proxy credential to the MyProxy server. This 
is then valid for one week by default. 

(6) The user starts the application instance and the AHE returns the initial 
status of the application instance. 

(7) Once the App WS-Resource is created as a result of the Prepare opera- 
tion, its reference handle and associated properties are stored in the App 
Instance Registry (see 3 above). 

(8) The AHE server creates one or more JSDL documents based on the user's 
specification of the application instance and submits them to GridSAM 
which then starts the application instance on the target grid resource. 

(9) The App WS-Resource uses the proxy certificate for authentication with 
the various grid resources that the user wishes to use. 

(10) The input files are staged from the FSA to the target grid resource on 
which the application instance will be executed. 

(11) The user can use the AHE client to invoke monitor and terminate commands
on the App WS-Resource in order to check the status and, if need
be, to terminate the application instance.

(12) Once the job finishes, the output files are transferred from the grid re- 
source to the FSA. 

(13) The user can use the AHE client to manually or automatically download 
the output files from the FSA. 

(14) The user can query the App Instance Registry to review the history of 
application instances. 
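The monitoring in steps 6 and 11 amounts to polling the status of the application instance until it reaches a final state. A minimal Python sketch of such a loop is given below; the status names and the get_status callable are hypothetical and do not correspond to the actual AHE client commands.

```python
import time

TERMINAL_STATES = {"FINISHED", "FAILED", "TERMINATED"}  # hypothetical status names


def wait_for_completion(get_status, poll_seconds=30.0, max_polls=1000):
    """Poll get_status() until a terminal state appears; return the distinct states seen."""
    seen = []
    for _ in range(max_polls):
        status = get_status()
        if not seen or seen[-1] != status:
            seen.append(status)  # record each change of state once
        if status in TERMINAL_STATES:
            return seen
        time.sleep(poll_seconds)
    raise TimeoutError("application instance did not reach a terminal state")


# Stand-in status source replaying a typical job life cycle.
fake = iter(["QUEUED", "QUEUED", "RUNNING", "RUNNING", "FINISHED"])
trace = wait_for_completion(lambda: next(fake), poll_seconds=0.0)
```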
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